Semi-supervised Dependency Parsing using Lexical Affinities
Abstract
Treebanks are not large enough to reliably model precise lexical phenomena. This deficiency causes attachment errors in parsers trained on such data. In this paper, we propose computing lexical affinities on large corpora for specific lexico-syntactic configurations that are hard to disambiguate, and introducing this new information into a parser. Experiments on the French Treebank showed a 7.1% relative reduction in error rate, measured by Labeled Accuracy Score, yielding the best parsing results on this treebank.
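The abstract does not say how a lexical affinity is scored. As an illustration only, the sketch below assumes pointwise mutual information (PMI) over (governor, dependent) pairs counted in an automatically parsed corpus. The function name `lexical_affinities`, the PMI formula, and the toy French pairs are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def lexical_affinities(pairs):
    """Score each (governor, dependent) pair with PMI.

    PMI is an assumed stand-in for the affinity measure; the abstract
    does not specify which score the authors use.
    """
    pairs = list(pairs)
    pair_counts = Counter(pairs)
    gov_counts = Counter(gov for gov, _ in pairs)
    dep_counts = Counter(dep for _, dep in pairs)
    total = len(pairs)

    scores = {}
    for (gov, dep), n in pair_counts.items():
        # PMI = log( P(gov, dep) / (P(gov) * P(dep)) )
        scores[(gov, dep)] = math.log(
            (n * total) / (gov_counts[gov] * dep_counts[dep])
        )
    return scores


# Toy, made-up observations of one hypothetical verb-object configuration.
observed = [
    ("manger", "pomme"), ("manger", "pomme"), ("manger", "table"),
    ("poser", "table"), ("poser", "table"), ("poser", "pomme"),
]
affinities = lexical_affinities(observed)
```

In a setup like this, a higher score for ("manger", "pomme") than for ("manger", "table") could be used to prefer the first attachment when the parser hesitates between the two. How the scores are actually fed into the parser is not described in the abstract.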
Similar resources
Learning Structured Classifiers for Statistical Dependency Parsing
In this thesis, I present three supervised and one semi-supervised machine learning approaches for improving statistical natural language dependency parsing. I first introduce a generative approach that uses a strictly lexicalised parsing model where all the parameters are based on words, without using any part-of-speech (POS) tags or grammatical categories. Then I present an improved large margi...
Simple Semi-supervised Dependency Parsing
We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of our approach in a series of dependency parsing experiments on the Penn Treebank, and we show that our cluster-based features yiel...
Dependency Parsing: Past, Present, and Future
Dependency parsing has attracted growing interest in natural language processing in recent years because of its simplicity and its general applicability to diverse languages. The Conference on Computational Natural Language Learning (CoNLL) organized shared tasks on multilingual dependency parsing each year from 2006 to 2009, which led to extensive progress on dependency pars...
Improved CCG Parsing with Semi-supervised Supertagging
Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal. We show how a state-of-the-art CCG parser can be enhanced by predicting lexical categories using unsupervised vector-space embeddings of words. The use of word embeddings enables our model to better generalize from the labelled data, and allows us to ...
Simple Semi-supervised Dependency Parsing
We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that...
Publication date: 2012